Abstract

Background: Despite the widespread use of patient-reported outcomes (PRO) in clinical studies, their design
remains a challenge. Justification of study size is rarely provided, especially when a Rasch model is planned for
analysing the data in a 2-group comparison study. The classical sample size formula (CLASSIC) for comparing
normally distributed endpoints between two groups has been shown to be inadequate in this setting (underestimated
study sizes). A correction factor (RATIO) has been proposed to reach an adequate sample size from the CLASSIC
when a Rasch model is intended to be used for analysis. The objective was to explore the impact of the parameters
used for study design on the RATIO and to identify the most relevant ones in order to provide a simple method for
sample size determination for Rasch modelling.

Methods: A large combination of parameters used for study design was simulated using a Monte Carlo method: 
variance of the latent trait, group effect, sample size per group, number of items and items difficulty parameters. 
A linear regression model explaining the RATIO and including all the former parameters as covariates was fitted. 

Results: The most relevant parameters explaining the ratio's variations were the number of items and the variance
of the latent trait (R² = 99.4%).

Conclusions: Using the classical sample size formula adjusted with the proposed RATIO can provide a straightforward 
and reliable formula for sample size computation for 2-group comparison of PRO data using Rasch models. 

Keywords: Patient-reported outcomes, Item response theory, Rasch model, Sample size, Power 



Background 

Patient-reported outcomes (PRO) are increasingly used 
in clinical research; they have become essential criteria 
that have gained major importance especially in chronic- 
ally ill patients. Consequently, nowadays these outcomes 
are often considered as main secondary endpoints or 
even primary endpoints in clinical studies [1-4]. Two 
main types of analytic strategies are used for PRO data: 
so-called classical test theory (CTT) and models coming 
from Item Response Theory (IRT). CTT relies on the 
observed scores (possibly weighted sum of patients 
items' responses) that are assumed to provide a good 
representation of a "true" score, while IRT relies on an 
underlying response model relating the items responses
to a latent trait, interpreted as the true individual quality 
of life (QoL) for instance. The potential of IRT models 
for constructing, validating, and reducing questionnaires 
and for analyzing PRO data has been regularly underlined
[5-7]. IRT and in particular Rasch family models
[8] can improve on the classical approach to PRO assessment
with advantages that include interval measurements,
appropriate management of missing data [9-11]
and of possible floor and ceiling effects, and comparison of
patients across different instruments [12]. Consequently,
many questionnaires are validated (or revalidated) using
IRT along with CTT [13-15], which allows PRO
data to be analysed with IRT models in clinical research.

Clinical research methodology has reached a high level of
requirements through the publication of international
guidelines, for instance the CONSORT statement and the STROBE
(Strengthening the Reporting of Observational Studies
in Epidemiology) and TREND (Transparent Reporting of
Evaluations with Nonrandomized Designs) initiatives
[16-19]. All of these published recommendations
are aimed at improving the reporting of scientific
investigations coming either from randomized clinical 
trials or observational studies and systematically include 
an item related to sample size justification and deter- 
mination. Furthermore, good methodological standards 
recommend that methods used for sample size planning 
and for subsequent statistical analysis should be based 
on similar grounds. Even though guidelines have also recently
been published for PRO-based studies [20,21],
reports of such studies often fail to justify the
study size or to describe its computation. Three
main types of situations are often encountered in 2- 
group comparison studies: i) sample size determination 
is not performed whatever the intended analysis for 
PRO data (CTT and/or IRT), ii) tentative justification is 
occasionally given a posteriori for the size of studies, iii) 
sample size computation is made a priori but only relies 
on CTT (mostly using the classical formula for compar- 
ing normally distributed endpoints on expected mean 
scores) even if IRT models are envisaged for data ana- 
lysis. In this latter case, previous studies have shown 
that the classical formula was inadequate for IRT models 
because it leads to underestimation of the required sample 
size [22]. From this perspective, a method has been re- 
cently developed for power and sample size determination 
when designing a study using a PRO as a primary end- 
point when IRT models coming from the Rasch family are 
intended to be used for subsequent analysis of the data 
[23]. This method, named Raschpower, provides the 
power for a given sample size during the planning stage of 
a study in the framework of Rasch models. It depends on
the following parameters (that are a priori assumed and
fixed): the parameters related to the items of the questionnaire
(number of items J and difficulty parameters δ_j, j = 1,
..., J), the variance of the latent trait (σ²) and the mean difference
between groups on the latent trait (γ). Some of
these parameters are easily known a priori when planning
a study (e.g. number of items); others are sometimes more
difficult to obtain (e.g. item difficulties, σ², γ) and initial
estimates based on the literature or pilot studies are required.
Besides, whether all these parameters have the
same importance regarding sample size determination for 
Rasch models is unknown. The aim of our paper is to ex- 
plore the relative impact of these parameters on sample 
size computation and to identify the most relevant to be 
used during study design for reliable power determination 
for Rasch models. Our main objective is to provide a sim- 
ple method for sample size determination when a Rasch 
model is planned for analysing PRO data in a 2-group 
comparison study. 



Methods 

The Rasch model 

In the Rasch model [8], the responses to the items are 
modelled as a function of a latent variable representing 
the so-called ability of a patient measured by the ques- 
tionnaire (e.g. QoL, anxiety, fatigue...). The latent vari- 
able is often considered as a random variable assumed 
to follow a normal distribution. In this model, each item 
is characterized by one parameter (δ_j for the jth item),
named item difficulty because the higher its value, the 
lower the probability of a positive (favourable) response 
of the patient to this item regarding the latent trait 
being measured. 

Let us consider that two groups of patients are compared
and that a total of N patients have answered a
questionnaire containing J binary items. Let X_ij be a binary
random variable representing the response of patient
i to item j with realization x_ij, θ_i be the realization of the
latent trait θ for this patient, and γ the group effect defined
as the difference between the means of the latent
trait in the two groups.

For each patient, the probability of responding to each 
item is: 



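\[
P(X_{ij} = x_{ij} \mid \theta_i, \gamma, \delta_j) = \frac{\exp\{(\theta_i + g_i \gamma - \delta_j)\, x_{ij}\}}{1 + \exp(\theta_i + g_i \gamma - \delta_j)}, \quad i = 1, \dots, N \text{ and } j = 1, \dots, J \qquad (1)
\]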



where δ_j represents the difficulty parameter of item j
and g_i = 0, 1 for patients in the first or second group, respectively.
The latent variable θ is usually a random
variable following a normal distribution with unknown
parameters μ and σ². Marginal maximum likelihood estimation
is often used for estimating the parameters of
the model.
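
As an illustration, the response probability of Eq 1 can be evaluated numerically. The following is a minimal Python sketch and not the authors' implementation; the function name rasch_probability and its argument names are hypothetical and were chosen only for this example.

```python
import math


def rasch_probability(x, theta, delta, gamma=0.0, group=0):
    """Probability of observing response x (0 or 1) under Eq 1.

    theta: latent trait value of the patient
    delta: difficulty of the item
    gamma: group effect
    group: 0 for the first group, 1 for the second group
    (All names are illustrative assumptions, not from the paper.)
    """
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1 for a binary item")
    eta = theta + group * gamma - delta  # linear predictor of Eq 1
    return math.exp(eta * x) / (1.0 + math.exp(eta))
```

For example, a patient of the first group with θ = 0 answering an item with δ = 0 has a probability of 0.5 of a positive response, and increasing δ lowers that probability, which is why δ is called a difficulty.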

Sample size determination in the framework of the Rasch 
model - The Raschpower method 

We assume that we want to design a clinical trial using
a given dimension of a PRO (e.g. the Mental Health dimension
of the SF-36) as a primary outcome in a two-group
cross-sectional study. Let γ (assumed > 0) be the
difference between the mean values of the latent trait
(e.g. mental health) in the two groups and σ² the common
variance of the latent trait in both groups. We assume
that the study involves the comparison of the two
hypotheses H0: γ = 0 against the two-sided alternative
H1: γ ≠ 0. If we plan to use a Rasch model that includes
a group effect γ (Eq 1) to test this null hypothesis on the
data that will be gathered during the study with a given
power 1-β_R and type I error α, determination of the required
sample size can be made using an adapted formula
that has been implemented in the Raschpower
method [23]. This method is based on the power of the
Wald test of group effect γ for a given sample size and it
is briefly described here. To perform a Wald test, an estimate γ̂
of γ is required as well as its standard error. Since we are
designing a study, some assumptions are made regarding
the expected values of these parameters. More specifically,
γ̂ is set at the assumed value for the group effect, γ, and its
standard error is obtained as follows: an expected dataset
of the patients' responses is created conditionally on the
planning values that are assumed for the sample size in
each group, the group effect γ, the item difficulties δ_j, and
the variance of the latent trait σ². The probabilities and
the expected frequencies of all possible response patterns
for each group are computed with the statistical model
that will be used for analyzing the data that will be gathered
during the study: a Rasch model. The variance of the
group effect estimate, Var(γ̂), is subsequently estimated using a
Rasch model including a group effect with δ_j and σ² fixed
to their planned expected values.

The power 1-β_R is then computed with the following
formula:

\[
1 - \beta_R \approx 1 - \Phi\left( z_{1-\alpha/2} - \frac{\gamma}{\sqrt{\mathrm{Var}(\hat{\gamma})}} \right) \qquad (2)
\]

where Φ is the cumulative standard normal distribution
function and z_{1-α/2} is the (1-α/2) percentile of the standard normal
distribution. 1-β_R is the power of the Wald test of group
effect when a Rasch model is used to detect γ at level α. In
practice, γ, σ², and the item difficulties are unknown
population parameters and initial estimates based on the
literature or pilot studies are required for calculations.
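
The power formula of Eq 2 is straightforward to evaluate once a planning value for the variance of the group effect estimate is available. The sketch below assumes that this variance has already been obtained (computing it from the expected dataset is the core of the Raschpower method and is not reproduced here); the names wald_power and var_gamma are hypothetical.

```python
from statistics import NormalDist


def wald_power(gamma, var_gamma, alpha=0.05):
    """Approximate power of the two-sided Wald test of the group effect (Eq 2).

    gamma: assumed group effect
    var_gamma: planning value for the variance of the group effect estimate,
               assumed to be supplied by the user (not computed here)
    alpha: type I error
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)  # z_{1-alpha/2}
    return 1.0 - std_normal.cdf(z - gamma / var_gamma ** 0.5)
```

With γ = 0 the function returns α/2, which reflects that Eq 2 only accounts for the rejection region in the direction of the assumed effect.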

Relationship between the Raschpower method and the 
classical formula for manifest normal variables 

Using the same notation as before (γ is the group effect
and σ² is the common variance of the latent trait for both
groups), we can also compute the required sample size
per group (N_C0 for the first group and N_C1 for the second
group) using the classical formula for comparing normally
distributed endpoints with a given power 1-β and a type I
error α to detect the group effect γ as follows [24]:

\[
N_{C0} = \frac{(k + 1) \times \sigma^2 \times (z_{1-\alpha/2} + z_{1-\beta})^2}{k \times \gamma^2} \qquad (3)
\]

where N_C1 = k × N_C0 (when k = 1, the sample sizes are
assumed equal in both groups).

The power 1-β for detecting a difference between
groups equal to γ with a total sample size of N_C0 + N_C1
and a type I error set to α can also be computed as:

\[
1 - \beta = \Phi\left( \sqrt{\frac{k N_{C0} \times \gamma^2}{(k + 1)\, \sigma^2}} - z_{1-\alpha/2} \right) \qquad (4)
\]

Let us assume without loss of generality that k = 1, that
is, we expect the sample sizes to be equal in each
group (N_C0 = N_C1 = N_g). It has been shown [23] that
the sample size per group computed using this classical
formula (N_g) provides a power of 1-β at level α
for CTT-based analysis but a lower power for Rasch-based
analysis, computed with the Raschpower method, namely 1-β_R < 1-β
(Figure 1, RP). Thus, using this classical formula, the
sample size required when a Rasch model is used has to
be increased to reach the desired power of 1-β (i.e. N_g
has to be increased).
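
The classical formula (Eq 3) and the corresponding power (Eq 4) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code; the function names are hypothetical, and rounding the result up to the next integer is an assumption made here.

```python
import math
from statistics import NormalDist


def classical_n_first_group(gamma, sigma2, power=0.90, alpha=0.05, k=1.0):
    """Unrounded sample size of the first group from the classical formula (Eq 3)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)  # z_{1-alpha/2}
    z_beta = std_normal.inv_cdf(power)               # z_{1-beta}
    return (k + 1.0) * sigma2 * (z_alpha + z_beta) ** 2 / (k * gamma ** 2)


def classical_power(gamma, sigma2, n_first_group, alpha=0.05, k=1.0):
    """Power obtained with n_first_group subjects in the first group (Eq 4)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = math.sqrt(k * n_first_group * gamma ** 2 / ((k + 1.0) * sigma2))
    return std_normal.cdf(shift - z_alpha)


# Planning values taken from the NHP example of this paper (gamma = 0.649,
# sigma2 = 3.9323, power = 90%, alpha = 5%); rounding up is an assumption.
n_g = math.ceil(classical_n_first_group(0.649, 3.9323, power=0.90))
```

With these planning values the rounded-up sample size is 197 per group, the value of N_g reported in the NHP example.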

It has been observed in a previous study that this increase
could be easily computed using the following
relationships:

- since 1-β_R < 1-β, the sample size that provides a
power of 1-β_R using the classical formula (Eq 3 and
Figure 1, CF), say N_c, is lower than N_g and the ratio
Ra = N_g / N_c (Figure 1) is therefore higher than 1

- previous observations [23] have shown that this ratio
Ra remained stable for different values of N_g and 1-β_R,
given γ, J and item difficulties

- it has been noticed that multiplying N_g by this ratio
gave a sample size of N_R = N_g × Ra (Figure 1) that
could provide the desired power 1-β for Rasch modelling
(Figure 1, RP)

Hence this ratio Ra is obtained from the well-known classical
formula and can be used to provide sample size calculations
for Rasch modelling.
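
The three relationships above can be chained as in the following Python sketch. It assumes that the Rasch-based power 1-β_R for N_g has already been obtained from the Raschpower method, which is not implemented here, and it passes that power in as an argument. The function name and the choice to round N_c up to an integer are assumptions of this example.

```python
import math
from statistics import NormalDist


def ratio_from_rasch_power(n_g, power_rasch, gamma, sigma2, alpha=0.05):
    """Return (N_c, Ra, N_R) for equal group sizes (k = 1).

    n_g: sample size per group from the classical formula at the target power
    power_rasch: power 1 - beta_R given by Raschpower for n_g (supplied by the user)
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_beta_r = std_normal.inv_cdf(power_rasch)
    # Classical formula (Eq 3 with k = 1) evaluated at the Rasch-based power
    n_c = math.ceil(2.0 * sigma2 * (z_alpha + z_beta_r) ** 2 / gamma ** 2)
    ra = n_g / n_c      # ratio, higher than 1 when power_rasch < target power
    n_r = n_g * ra      # unrounded sample size per group for Rasch modelling
    return n_c, ra, n_r


# Values from the NHP example: N_g = 197 and an 80% power given by Raschpower.
n_c, ra, n_r = ratio_from_rasch_power(197, 0.80, gamma=0.649, sigma2=3.9323)
```

With the NHP planning values this gives N_c = 147 and Ra = 197 / 147, about 1.34, in line with the figures reported in the example section.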

Simulations 

A simulation study was performed in order to gain
more insight into the relationships between the parameters
that are required when planning a study for power
determination for a given sample size (γ, σ², δ_j, J) and
the ratio Ra. A large number of cases (10^6) were simulated
with each case corresponding to a single parameter
combination (γ, σ², δ_j, J, N_g). The parameter values were
randomly drawn from continuous or discrete uniform
distributions, U[min-max], for: the variance of the latent
trait σ² (U[0.25-9]), the group effect γ (U[0.2×σ - 0.8×σ]),
the number of items J (U[3-20]), and the sample size per
group N_g assumed to be equal in both groups (U[50-500]).
The item difficulty parameters δ_j, j = 1, ..., J, were
drawn from a centred normal distribution with variance
σ² and set to the percentiles of the distribution. The
Raschpower method was applied to each parameter
combination and provided the power 1-β_R for Rasch
modelling as well as the ratio Ra. Multiple linear regression
was performed to assess the contribution of N_g, γ, J,
σ² and the difficulty parameters δ_j, j = 1, ..., J to the
variation of the ratio Ra. The effects of the difficulty
parameters on Ra were investigated in several ways for
different values of J: i) by introducing each parameter
δ_j, j = 1, ..., J individually, ii) by introducing their mean and
variance. A two-tailed P-value < 0.05 was considered significant.
The variance explained by the model (R²) and
the root mean square error (RMSE) were obtained and
contributed to variable selection. Variables were removed
if R² and RMSE remained stable (within a 0.01
range). Post-regression diagnostics were performed to ensure
that all linear regression assumptions were met
(normality and homoscedasticity of residuals). Statistical
analysis was performed using SAS statistical software
version 9.3 (SAS Institute Inc, Cary, North Carolina).

Figure 1 Description of the whole procedure for power and sample size determination using the ratio with the Raschpower method
and the linear regression model.
RP: Raschpower method; CF: classical formula for normally distributed variables;
N_g: sample size per group associated with a power of 1-β_R for Rasch analysis and 1-β for CTT-based
analysis; γ: difference between the mean values of the latent trait in the two groups
(group effect); δ: vector of the items parameters; σ²: variance of the latent trait; J: number of items;
N_c: sample size per group associated with a power of 1-β_R for CTT-based analysis; Ra: ratio computed
with the Raschpower method; R̂a: ratio predicted by the linear regression model;
Δ_R = R̂a - Ra; N_R: sample size per group computed with the Raschpower method associated
with a power of 1-β for Rasch analysis; N̂_R: sample size per group predicted by the linear
regression model; Δ_N = N̂_R - N_R.
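
To make the simulation design concrete, the following Python sketch draws one parameter combination from the distributions described above. It is only an illustration, not the authors' SAS code. In particular, the text states that the item difficulties are set to percentiles of a centred normal distribution with variance σ² without giving the percentile levels; the evenly spaced levels j / (J + 1) used below are an assumption of this example, as are all function and key names.

```python
import random
from statistics import NormalDist


def draw_parameter_combination(rng):
    """Draw one (sigma2, gamma, J, N_g, deltas) combination of the simulation design."""
    sigma2 = rng.uniform(0.25, 9.0)                   # variance of the latent trait
    sigma = sigma2 ** 0.5
    gamma = rng.uniform(0.2 * sigma, 0.8 * sigma)     # group effect
    n_items = rng.randint(3, 20)                      # number of items J (inclusive bounds)
    n_g = rng.randint(50, 500)                        # sample size per group
    latent = NormalDist(mu=0.0, sigma=sigma)
    # ASSUMPTION: difficulties placed at evenly spaced percentiles j / (J + 1)
    deltas = [latent.inv_cdf(j / (n_items + 1)) for j in range(1, n_items + 1)]
    return {"sigma2": sigma2, "gamma": gamma, "n_items": n_items,
            "n_g": n_g, "deltas": deltas}


rng = random.Random(2014)  # fixed seed so that the draw is reproducible
combination = draw_parameter_combination(rng)
```

Each such combination would then be passed to the Raschpower method to obtain 1-β_R and Ra, which is the step that this sketch leaves out.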

Results 

Among the 10^6 parameter combinations, 15278 corresponded
to the largest power for CTT and Rasch-based
analysis, 100%, where the ratio cannot be computed.
Hence all analyses were performed on 984722 parameter
combinations.

A full linear model explaining the value of Ra was first
fitted including N_g, γ, 1/J, 1/σ², the difficulty parameters
(included either individually or using their mean and
variance) and their interactions. A backward procedure
was used for variable selection relying on the R² and
RMSE variations between models and not on p-values.
Indeed, since the number of simulated combinations
was high (984722), all parameters were significant but
not necessarily meaningful (very small estimated values).
The R² and RMSE remained stable during the backward
procedure until the final model only containing 1/J and
1/σ² and their interaction was obtained (a maximum
variation of 0.0015 and of 0.0037 was observed for the
R² and the RMSE, respectively). The model that was
retained can be written as follows:

\[
Ra_i = \beta_0 + \left( \beta_1 \times \frac{1}{\sigma_i^2} + \beta_2 \times \frac{1}{J_i} \right) + \left( \beta_3 \times \frac{1}{\sigma_i^2} \times \frac{1}{J_i} \right) + \varepsilon_i \qquad (5)
\]

where ε_i ~ N(0, σ_ε²), with σ_ε² the residual variance, for i = 1, ..., 984722.

Table 1 shows the estimates of the multiple linear regression
model that explains R² = 99.4% of the variance
of the ratio and displays high accuracy (RMSE = 0.030).
The interaction between 1/σ² and 1/J is significant; the
effect of 1/σ² on the ratio seems to be more pronounced
when 1/J is large (i.e. J is small). The ratio increases with
1/σ² (i.e. when σ² decreases) and with 1/J (i.e. when
J gets smaller).

Table 1 Parameter estimates of the linear regression
model explaining the ratio provided by the Raschpower
method

Variables                    Estimate (SE), Npop = 984722    P-value
Intercept                    1.012 (7.0 × 10^-5)             < 10^-3
1/σ²                         0.095 (1.0 × 10^-4)             < 10^-3
1/J                          0.939 (5.0 × 10^-4)             < 10^-3
Interaction (1/σ² × 1/J)     3.730 (7.5 × 10^-4)             < 10^-3
R²                           0.994                           /
RMSE                         0.030                           /

Standard errors in parentheses; σ²: variance of the latent trait; J: number
of items.
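
The retained regression model can be evaluated directly from the coefficients of Table 1. The Python sketch below is one possible way to do so; the function names are hypothetical, and rounding the adjusted sample size up to the next integer is an assumption of this example.

```python
import math

# Coefficient estimates reported in Table 1
INTERCEPT = 1.012
COEF_INV_SIGMA2 = 0.095
COEF_INV_J = 0.939
COEF_INTERACTION = 3.730


def predicted_ratio(sigma2, n_items):
    """Ratio predicted by the regression model (Eq 5 without the error term)."""
    inv_sigma2 = 1.0 / sigma2
    inv_j = 1.0 / n_items
    return (INTERCEPT
            + COEF_INV_SIGMA2 * inv_sigma2
            + COEF_INV_J * inv_j
            + COEF_INTERACTION * inv_sigma2 * inv_j)


def adjusted_sample_size(n_g, sigma2, n_items):
    """Sample size per group for Rasch modelling: N_g times the predicted ratio."""
    return math.ceil(n_g * predicted_ratio(sigma2, n_items))
```

The model was fitted on simulated values of σ² between 0.25 and 9 and of J between 3 and 20, so predictions outside these ranges would be extrapolations.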

The number of subjects per group predicted by this
model was computed as follows: N̂_R = N_g × R̂a, where
R̂a is the ratio predicted by the model. It was compared
to the expected number of subjects per group N_R = N_g ×
Ra, where Ra is the ratio derived from the Raschpower
method. The difference between the ratio (respectively
number of subjects per group) predicted by the model
R̂a (respectively N̂_R) and the one associated with the
Raschpower method Ra (respectively N_R) was computed
for all parameter combinations with Δ_R = R̂a - Ra and
Δ_N = N̂_R - N_R. Figure 2 shows the distribution of Δ_N /
N_R, which is distributed around 0.

Table 2 Distributions of the difference between the ratio
(respectively number of subjects per group) predicted by
the model and the one expected by the Raschpower
method, Δ_R (respectively Δ_N), and according to the
threshold (Thres) for Δ_N

Variables (Npop = 984722)    2.5% / Median / 97.5%       [min ; max]
Δ_R                          -0.049 / 0.002 / 0.043      [-1.236 ; 0.230]
Δ_N                          -10.623 / 0.438 / 13.499    [-179.576 ; 112.064]

                             n (%)
-Thres < Δ_N < +Thres        968364 (98.34%)
Δ_N < -Thres                 10865 (1.10%) §
Δ_N > +Thres                 5493 (0.56%) †

Thres: threshold corresponding to 5% of the number of subjects per group
derived from the Raschpower method.
§: underestimation of the number of subjects per group produced by the
model as compared to the Raschpower method; †: overestimation of the
number of subjects per group produced by the model as compared to the
Raschpower method.

Figure 2 Distributions of Δ_N / N_R with Δ_N = N̂_R - N_R, where N̂_R is the number of subjects per group predicted by the linear regression
model and N_R is the number of subjects per group associated with the Raschpower method.

To quantify more precisely the magnitude of the difference
Δ_N, a threshold (Thres) corresponding to 5% of
the number of subjects per group expected with the
Raschpower method, Thres = 0.05 × N_R, was calculated
for all parameter combinations. The descriptive statistics
related to the distributions of Δ_R, Δ_N, and to Δ_N
with respect to Thres are displayed in Table 2. Ninety-five
percent of the values of Δ_R (respectively Δ_N) lie between
-0.049 and 0.043 (respectively -10.623 and
13.499). The largest overestimation (respectively underestimation)
of the number of subjects per group predicted
by the model is about 112 subjects per group
(respectively 180 subjects per group). The distribution
of Δ_N mostly lies (98.34% of the cases) within the interval
[-Thres ; +Thres] corresponding to ±5% of the number
of subjects per group expected with the Raschpower
method. Moreover, the model rarely predicted (0.56%) an
overestimated number of subjects per group of more than
5% of the sample size per group expected with Raschpower.
This case only occurs when the variance of the latent
trait σ² < 1 and J > 7 items. An underestimated
number of subjects per group of more than 5% of the
sample size per group expected with Raschpower is occasionally
observed (1.10%) and it mostly occurs (more than
90% of the cases) when J is larger than 16 items and
mostly when σ² < 1 (75% of the cases).

The whole procedure including the Raschpower
method and the linear regression model for power and
sample size determination using the ratio is summarized
in Figure 1.
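
The comparison against the 5% threshold can be expressed as a small classification rule. The Python sketch below is only an illustration of that rule; the function name and the returned labels are hypothetical.

```python
def classify_prediction(n_predicted, n_raschpower, relative_threshold=0.05):
    """Classify the regression-based sample size against the Raschpower one.

    Returns "under" if the model underestimates by more than the threshold,
    "over" if it overestimates by more than the threshold, "within" otherwise.
    """
    delta_n = n_predicted - n_raschpower          # difference between the two sample sizes
    thres = relative_threshold * n_raschpower     # Thres = 0.05 x N_R by default
    if delta_n < -thres:
        return "under"
    if delta_n > thres:
        return "over"
    return "within"
```

For instance, a predicted size of 251 against a Raschpower size of 264 (the values of the NHP example) gives a difference of -13 for a threshold of 13.2 and is therefore classified as within the ±5% interval.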



An example of sample size determination in
clinical research using the ratio - NHP data

The data come from a pilot study whose main objective
is to compare the pain level of two groups of patients
having either Steinert's disease or another muscular dystrophy.
The two disease groups have similar symptoms
but also present a number of dissimilar features such as
pain, cognitive disorders or male hypogonadism that are
more frequently encountered in patients suffering from
Steinert's disease and may impact QoL. Since QoL and
in particular pain assessment may help to better understand
the burden of disease from the patients' perspective
and to improve health outcomes and management,
the pain dimension of the Nottingham Health Profile
(NHP) questionnaire was used; it is composed of eight
binary items (J = 8). The ethics committee of Reims,
France granted approval for the study and patients were
recruited in the university hospital of Reims: 52 patients
with Steinert's disease and 95 patients
with other muscular dystrophies were included. A Rasch model
including a group effect γ was fitted on these data and its
global fit was not rejected by the R1m test (p = 0.329)
[25]. The estimate of the difference between the means
of the latent trait of the two groups was γ̂ = 0.649 and
the estimated latent trait's variance was σ̂² = 3.9323
(non-significant difference between groups: p = 0.08).
The objective was to use this pilot study to help plan
a future, possibly larger, study that would provide
enough power to detect this difference on the latent trait
using a Rasch model. Indeed, it seemed valuable to the
clinicians to determine a sample size large enough to
detect this difference, considered as
clinically relevant, with a power of 1-β = 90% using a
Rasch model. The sample size per group computed
using the classical formula (Eq 3), for detecting γ =
0.649 with a 90% power at α = 5%, assuming σ² = 3.9323,
is N_g = 197 for CTT-based analysis. We know that N_g
has to be increased to reach the desired power for Rasch
modelling using the ratio. The ratio predicted by the
multiple linear regression model can be easily computed
as follows using the values of J and σ̂²:

\[
\widehat{Ra} = 1.012 + 0.095 \times \frac{1}{3.9323} + 0.939 \times \frac{1}{8} + 3.730 \times \frac{1}{3.9323} \times \frac{1}{8} = 1.27210 \qquad (6)
\]

Multiplying N_g by this ratio gives a sample size of N̂_R =
197 × 1.27210 ≈ 251 patients per group that should provide
the desired power of 90% for Rasch modelling of the
pain dimension of the NHP questionnaire. These results
were compared to those obtained with the Raschpower
method using the estimated difficulty parameters from the
pilot study (2.61, 2.94, 1.75, 0.46, -0.11, 0.36, 1.28, 2.23),
γ = 0.649, σ² = 3.9323, and N_g = 197 per group. An 80%
power (1-β_R) is expected using the Raschpower method
for Rasch modelling with a sample size of N_g = 197 per
group (Figure 1, RP). The proposed ratio is therefore
equal to 197 / 147 = 1.34 where N_c = 147 is the number
of subjects per group that provides a power of 80%
using the classical formula (Figure 1, CF). Hence,
using the ratio, 197 × 1.34 ≈ 264 (N_R) patients per group
should provide the desired power of 90% for Rasch
modelling (Figure 1, RP).

The parameters that are required for the determination
of the ratio using the linear regression model or
the Raschpower method as well as their corresponding
values appear in Table 3. The ratios provided by the linear
regression model and Raschpower (Table 3) are close
to one another (Δ_R = -0.0679) and the numbers of subjects
per group are |Δ_N| = 13 patients apart. Moreover,
since |Δ_N| / N_R = 13 / 264 = 0.0492, the linear model's
prediction was within 5% of the expected sample size
provided by the Raschpower method.

Table 3 Comparison of the required parameters and the
results obtained using the linear regression model and
the Raschpower method on the NHP data

Variables    Linear regression model    Raschpower method
σ²           3.9323                     3.9323
J            8                          8
γ            /                          0.649
N_g          /                          197
δ            /                          (2.61, 2.94, 1.75, 0.46, -0.11, 0.36, 1.28, 2.23)
Ra           1.27210                    1.34
N_R          251                        264

σ²: variance of the latent trait; J: number of items; γ: difference between the
mean values of the latent trait in the two groups (group effect); N_g: sample
size per group providing a 1-β = 90% power with the classical formula and a
1-β_R = 80% power with Raschpower; δ: vector of the items parameters (δ_1, δ_2,
δ_3, δ_4, δ_5, δ_6, δ_7, δ_8); Ra: ratio; N_R: sample size per group providing a 1-β = 90%
power for Rasch analysis with the linear regression model and Raschpower; /:
not required.

Sebille et al. BMC Medical Research Methodology 2014, 14:87
http://www.biomedcentral.com/1471-2288/14/87



Discussion 

Our results revealed that the sample size required in the
framework of two-group cross-sectional studies for
subsequent use of a Rasch model to analyse PRO data can
be easily computed using the classical formula for comparing
normally distributed endpoints along with a correction
factor (named ratio in this paper). The most relevant
parameters explaining this ratio's variation (R² = 99.4%)
were the number of items of the questionnaire to be used
in the study (J) and the latent trait's variance (σ²). Hence
when designing a study, the most important parameters
for reliable power determination using this ratio when a
Rasch model is intended to be used to analyse PRO data
appear to be the variance of the latent trait and the number
of items regardless of the values of the group effect (γ)
and item parameters (δ_j, j = 1, ..., J). A preliminary investigation
had already shown that the precision with
which item difficulty parameters were known did not have
an impact on power determination of the test of group effect
using a Rasch model [22]. However, in this previous
study, the number of items J greatly impacted power, as
was observed in our current study for sample size determination,
sample size and power being very closely
related. The power increased with J, in line with what we
observed in this study where the ratio decreased when J
rose from 3 to 20 items, implying that fewer subjects were
needed to obtain the same power when J = 20 as compared
to J = 3. Moreover, this decrease of the ratio was
more marked as σ² got smaller (significant interaction between
1/J and 1/σ²). Quite a large range of values was
chosen for the variance of the latent trait (from 0.25 to 9)
and for the number of items J (from 3 to 20), which allowed
investigating in more depth the magnitude of their impact
on the ratio. Figure 3 shows the evolution of the ratio Ra
as a function of the number of items J according to the
values of the variance of the latent trait σ². The effect of
σ² on the ratio was large, especially for small values of the
variance (σ² < 1), the ratio increasing as σ² decreased. This
result, which might be thought of as counter-intuitive, comes
from the fact that the ratio, used to correct the sample size
coming from the classical formula to obtain an adequately
powered Rasch model, is a measure of the distance between
the sample sizes corresponding to the powers expected
for CTT and Rasch-based analyses. This distance
becomes larger as the variance gets smaller and it reaches
its maximum when σ² < 1. Hence, the correction factor
(ratio) is likely to get larger as σ² decreases and the distance
between the sample sizes for CTT and Rasch increases.
Furthermore, it can be noted that when σ² < 1, the
linear regression model could predict an overestimated
number of subjects per group of more than 5% of the
sample size per group expected with Raschpower (in at
most 0.56% of all parameter combinations). An underestimation
of more than 5% of the sample size per group
expected with Raschpower could also be noticed (in at
most 1.10% of all parameter combinations) for small
values of σ² (σ² < 1 in 75% of the cases) and large values of
J (J > 17 in more than 90% of the cases). It can be emphasized
that such small variances for the latent trait might
be rarely encountered in practice especially when J is large
[26,27]; hence this simple regression model should be reliable
for sample size determination in most situations usually
found in clinical research. Nevertheless, one of the
major issues regarding study design and sample size determination
still remains: to what expected values should we
fix the key parameters? In our case, the challenge lies in
one single parameter, the expected value for the variance
of the latent trait. Retrospective or pilot data or published
studies can be used for that purpose to provide
information regarding the plausible range of values for the
variance. However, it can turn out to be problematic if no
previous studies can provide this information and it seems
important to further study the impact of misspecifications
of the planning values for the variance on the performance
of the proposed method for sample size determination for
Rasch modelling.

The fact that the number of subjects given by the clas- 
sical formula, based on the latent trait, has to be in- 
creased using the ratio to reach the expected power for 
Rasch modelling could deliver a wrong message. Indeed, 
it could be interpreted as if Rasch models required more 
subjects than CTT-based analyses would. In fact, the 
classical formula is directly computed from the expected 
difference between the latent traits in both groups and
the latent trait's variance in each group, assumed to be
equal. By doing so, we assume that the means and vari- 
ance of the latent traits are "perfectly" known and thus 
do not take into account the fact that the latent trait is 
not an observed (manifest) variable. Hence, its estima- 
tion requires the use of a model which creates uncer- 
tainty, unlike scores that can be directly observed and 
measured. This uncertainty is taken into account by 
adjusting the sample size using the ratio to obtain an ad- 
equately sized study for Rasch modelling. Moreover, it 
has been underlined that the so-called effect size (differ- 
ence in means over the standard deviation) on the score 
scale was lower than the corresponding effect size on 
the latent trait scale. Consequently, the sample size re- 
quested for CTT-based analysis using the effect size on 
the score scale is higher than its counterpart on the la- 
tent trait scale. 

The proposed method can be used with confidence
when J lies between 3 and 20 and especially when the
variance of the latent trait is expected to be higher than 1.
Otherwise (when σ² < 1), the Raschpower method should
be preferred since the ratio-based approach might
under- or overestimate the sample size. One of the limitations of
our study is that we focused on one of the most well-known
IRT models, the Rasch model. The Raschpower
method has also been developed for other models that are
well suited for the analysis of polytomous item responses,
such as the Partial Credit Model or the Rating Scale
Model (Hardouin, under revision). Moreover, the Raschpower
method has recently been extended to deal with
longitudinal designs [28] and it might be expected that
this ratio would also be worthwhile in these contexts. Finally,
the Raschpower method (for dichotomous and polytomous
items and for cross-sectional and longitudinal
designs) and the ratio-based approach (for dichotomous
items) have been implemented in the free Raschpower
module available at the website PRO-online http://pro-online.univ-nantes.fr.

Conclusion 

Using the classical formula for normally distributed end- 
points along with the proposed ratio only depending on 
the number of items and the variance of the latent trait 
can provide a straightforward and reliable formula for 
sample size computation for subsequent Rasch-based 
analysis of PRO data. 